    XML data clustering: An overview

    In the last few years we have observed a proliferation of approaches for clustering XML documents and schemas based on their structure and content. So many approaches exist because different applications require XML data to be clustered, and each needs the data grouped by similar contents, tags, paths, structures, or semantics. In this paper, we first outline the application contexts in which clustering is useful, then we survey the approaches proposed so far according to the abstract representation of data (instances or schema), the similarity measure used, and the clustering algorithm. This presentation leads to a taxonomy in which the current approaches can be classified and compared. We aim to introduce an integrated view that is useful when comparing XML data clustering approaches, when developing a new clustering algorithm, and when implementing an XML clustering component. Finally, the paper describes future trends and research issues that still need to be addressed.

    Semantic Web Datatype Similarity: Towards Better RDF Document Matching

    With the advance of the Semantic Web, the need to integrate and combine data from different sources has increased considerably. Many efforts have focused on RDF document matching. However, their treatment of datatype similarity is limited. This paper addresses the issue of datatype similarity for the Semantic Web as a first step towards better RDF document matching. We propose a datatype hierarchy, based on W3C's XSD datatype hierarchy, that better captures the subsumption relationship among primitive and derived datatypes. We also propose a new datatype similarity measure that takes into consideration several aspects of the new hierarchical relations between compared datatypes. Our experiments show that the new similarity measure, along with the new hierarchy, produces better results (closer to what a human expert would think about the similarity of compared datatypes) than the ones described in the literature. © 2017, Springer International Publishing AG

    XML schema element similarity measures: a schema matching context

    In this paper, we classify, review, and experimentally compare the major methods used to define, adopt, and apply element similarity measures in the context of XML schema matching. We aim to present a unified view that is useful when developing a new element similarity measure, when implementing an XML schema matching component, when using an XML schema matching system, and when comparing XML schema matching systems.